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1^ ' The standard method for the propagation of errors, based on a Taylor series 

expansion, is approximate and frequently inadequate for realistic problems. 

k> , A simple and generic technique is described in which the likelihood is con- 

H ' structed numerically, thereby greatly facilitating the propagation of errors. 
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1 Introduction 

Traditionally, errors on derived quantities have been determined analytically by performing a Taylor 
series expansion about the central values. This method is adequate for many simple problems but 
often fails in more realistic situations for a number of different reasons, as described in section y. 

In section |^ we describe a rigorous and generally applicable numerical technique for the propagation 
of errors, based on the numerical construction of the likelihood for the derived quantity using pseudo- 
random numbers. Sections |^ and || illustrate the use of this technique with practical examples from 
High Energy Physics analyses. 

While the method proposed has almost certainly been used by others, it has received little attention 
in the literature. This is perhaps due to the fact that only in the last few years has it been practical to 
generate large numbers of pseudo-random numbers in order to solve a problem which was traditionally 
done by hand. 

2 Analytic Method for the Propagation of Errors 

In general, physical quantities are not known with infinite precision but are described in terms of 
likelihood distributions (also called probability or error distributions). To be specific, consider a 
quantity / which depends on some quantities Xj (i = 1, . . . , n) in a known way. The uncertainty on /, 
which depends on the uncertainties 6xi on the measured values x^, is usually derived by the standard 
method for the propagation of errors which relies on a Taylor series expansion of /: 



f{x^ + 6x,)=f{xf)+Y,6xi-L 



+ ••• (1) 



where the ellipsis denotes higher order terms in 6xi . These are neglected to yield the familiar expression 
for the variance of /: 
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where cjjj denotes the error matrix. For the results of this method to be valid, a number of requirements 
should be satisfied. 

Requirement 1: The likelihood distributions of Xj are Gaussian. 

The standard method for the propagation of errors implicitly assumes that the errors on the xi are 
Gaussian. In general, however, a measured quantity is described by an arbitrarily shaped likelihood 
distribution. Statistical and systematic errors on quantities Xi may be according to other common 
likelihood distributions (Poisson, binomial, etc.), or another function which may itself depend on 
measured quantities with associated errors. 

Requirement 2: The likelihood distribution of / is Gaussian. 

The standard method for propagation of errors yields a single number af which is interpreted as the 
RMS of the Gaussian likelihood distribution of /. It is often the case that the likelihood distribution 
of / is not Gaussian, nor even symmetric, even if all the errors on Xi are Gaussian. In such cases 
the interpretation of aj is unclear. A familiar example occurs in tracking detectors where the inverse 
transverse momentum, 1/pt, has a Gaussian error whereas the derived quantity, /(1/pt) =Pt, does 
not and is manifestly asymmetric. 



Requirement 3: The required derivatives are calculable. 

The evaluation of the derivatives, df /dxi, may prove to be difficult in practice if / is a complicated 
function of many parameters. If the dependence of / on Xi were determined numerically, for example 
by running a physics Monte Carlo program, it may be difficult or impossible to determine the required 
derivatives. Moreover, the derivatives may themselves have significant uncertainties due to limited 
Monte Carlo statistics and systematic errors from uncertainties in the values of physical parameters 
used by the Monte Carlo program. 

Requirement 4: Higher terms of the Taylor expansion are negligible. 

It is assumed that quadratic and even higher order terms in the Taylor expansion of Eq. Ill are negligible. 
This approximation is not always valid, for example if the first derivatives of / at Xi are zero or small 
compared to higher derivatives or if the errors are not small relative to the Xj. 

3 Numerical Method for the Propagation of Errors 

If one or more of the above conditions is not satisfied, we advocate the use of a numerical method 
for the propagation of errors, as described below. Even if all the conditions are satisfied in a given 
situation, the numerical method provides a valuable cross-check. Moreover, if the situation becomes 
progressively more complicated as more sources of error are identified, it is easy to extend the numerical 
method, whereas the standard method may ultimately fail one of the above assumptions. 

The numerical method is based on the familiar statistical concept of performing many hypothetical 
measurements of the same quantity and determining the error on that quantity from the spread of the 
values. It may be summarised as follows: 

Consider a function f{xi) where the independent variables Xi have uncertainties which are 
described by likelihood functions L{xi). The corresponding likelihood distribution L[f) may 
be described with arbitrary precision by a sufficiently large set of values, {f{x'^)}, where the 
input set of values, {x'j}, are chosen randomly according to the L{xi). 

While the standard method for the propagation of errors based on the Taylor expansion of Eq. |l] 
assumes a single 5xi, which is equated to the standard deviation of a Gaussian, the numerical method 
maps the full spectrum of errors 5xi and therefore does not require it to be Gaussian. Moreover, no 
higher order terms are neglected. 

Practically one implements the numerical method as follows: 

1. Define the likelihood functions, L{xi), for the independent variables, Xj. 

For example, in the case of uncorrelated Gaussian errors one would specify a set of central values 
and standard deviations. For correlated Gaussian errors one would specify a set of central 
values and a covariance matrix. In a more general case one might define the likelihood as a 
multi-dimensional function represented by a smooth parametrisation, a binned multi-dimensional 
histogram, or as a set of discrete values derived, for example, from a Monte Carlo program. 

2. Repeat the following steps, (a) and (b), n times, where n is a large number: 

(a) Choose a set of values {x'j} randomly according to the likelihood functions, L{xi). 

For example, in the case of uncorrelated Gaussian errors one might use the CERNLIB ||l| 
Fortran subroutine RANNOR. For correlated Gaussian errors one could use the NAGLIB pi 
Fortran subroutine G05EAF or a suitably transformed set of uncorrelated random numbers [H] . 



(b) Evaluate the function f{x[) and store the result, for example in a histogram or an array. 

3. Normalise the integral of L{f) to unity. 

4. Ensure that n is sufficiently large, for example by verifying that L{f) does not change significantly 
for different starting random number seeds. 

Care should be taken to avoid numerical problems associated with the computing hardware and 
software used y, page 659]. For example, if n is large then the accumulation of f{x[) values in a 
histogram may be subject to numerical imprecisions, or a poorly designed random number generator 
may start to show periodic behaviour. The optimisers of some compilers may even treat multiple 
invocations of a random number function, for example in a Fortran DO loop, as a single invocation 
and move it outside the loop thereby producing erroneous results. Caveats such as these, however, 
apply to most software and can be avoided easily by minimal precautions such as the use of extended 
precision variables and low levels of compiler optimisation. 

4 Example: Measurement of the W Boson Mass 

W mass measurements from hadron colliders [^ are typically based on fits to distributions of transverse 



mass, rriT = \/2pj,pj^{l — coscf)^^), where py and p^ denote the transverse momenta of the charged 
lepton (e or /i) and the neutrino respectively coming from the leptonic W decay, and (f)^^ denotes 
the angle between the charged lepton and the neutrino in the transverse plane. The charged lepton 
is typically measured in a tracking chamber with an error which is approximately Gaussian in 1/pj- 
while p^, which is inferred from the missing energy, has an approximately Gaussian error. The opening 
angle error is typically relatively small. 

Consider a single typical event with pj^ = 40 GeV, pj. = 35 GeV, (j) = 2 rad, which corresponds to 
mT = 63.0 GeV. Assuming that the errors on pj, and pj- are uncorrelated, then the standard method 
for propagation of errors using Eq. ^ yields 
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For example, given a^^i i = 0.005 GeV~^ and dpi' /pj^ = 10% yields amr = 7.0 GeV. 

The likelihood distribution of tjit is not in reality a Gaussian. Fig. |^ (solid line) shows the rriT 
likelihood distribution obtained by propagating the uncertainties on pj, and pi^ numerically. A clear 
asymmetry is seen, in contrast to the naive method for propagation of errors which, by construction, 
yields a Gaussian likelihood distribution (dashed line). The shift in the peak is ~ 2 GeV which is 
significant compared to typical errors on the W mass of 0(0. 1) GeV. 

5 Example: r neutrino mass and mixing 

Constraints on the tau neutrino mass and its mixing with a hypothetical fourth lepton generation 
have been derived Q by considering the dependence of tau branching fractions on the mass rrii,^ 
and the Cabibbo-like mixing angle 6l (or more naturally sin^^/,). The theoretical predictions are 
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Figure 1: Example likelihood distribution of transverse mass showing the difference between the 
numerical and analytic approaches described in the text. Both curves are normalised to unit area. 



compared with the experimental measurements for the following decays]^: r~ — > e~r'eUr, t~ -^ fi~r'^Ur, 
T~ — > 7r~i'r, and t~ -^ K~i/^. 

The experimental measurements of the branching ratios B^^^ [i = e, /n, tt, K) are uncorrelated and 
have Gaussian errors. If the theoretical predictions, B^ '^°'^-^, were perfectly known, then the likelihood 
for a particular choice of mj^^ and sin^ 6^ would beg 



L(m,^, sin2 eL\Br\B';:^^\ B^-^\ B^^^') = [] ^^,^^P (" (^ 
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where the o", are the errors on the i?^^'' . The predictions for B^ *^°'^'^, however, depend in turn on 
experimentally measured quantities with errors. 

The theoretical predictions for the branching fractions Bi for the decays t~ -^ i~U£L'T-, with i = e/^u, 
are given by M: 
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'^Henceforth we denote the branching ratios for these processes as Be, B^, Btt, Bk respectively; Bi denotes either Be 
or Bfj, while Bh denotes either B-^ or Bk- 

^The CLEO measurement of the r mass was used to further constrain m^^. From an analysis of t'^t~ — + {-k'^ mr'^ iz-r) 
{■K~ rmv^'' Ut) events (with n < 2,m < 2,1 < n + m < 3), CLEO determined the r mass to be ni-r — (1777.8 ± 0.7 ± 
1.7) + [m^^{MeV)]^ /lAOOMeV ||]. The Ukelihood for the CLEO and BBS measurements to agree, as a function of m^^ 
is included in the global likelihood. This does not affect the conchisions of this section but merely reinforces them. 



where Gp is the Fermi constant, m-r and r^ are the r mass and lifetime, x = mf/rn^, mi is the charged 
lepton mass, y = ml^/m^, rrn^ is the W mass, a{mr) is the renormahsed fine-structure constant at the 
T mass scale, and each ehipsis denotes neglected higher order terms. The first term in brackets allows 
for phase-space while the second term in brackets allows for radiative corrections [§-|ll|. Similarly, 
the branching fractions for the decays r~ — > \\~Vt-, with h = vr/K, are given by 
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where x = m^/m. 



^, TTT-h is the hadron mass, y = m1^/m^, /h are the hadronic form factors, and Vap 
are the CKM matrix elements, V^d and V^s, for vr" and K~ respectively. The ellipsis represents terms, 
estimated to be ©(±0.01) |I^, which are neither explicitly treated nor implicitly absorbed into Gp, 
/TriKdl, or /k|Vus|. 

The uncertainties on the B^ '^^-^ depend on the errors on the values of: Gp, t,-, nie, m^, niT^, rriK, 
mW) and mz fl^; rrir from the BES measurement at threshold 10; /7r|Kid| and /kIKisI |12, and 
references therein] ; and the estimated theoretical uncertainty due to (neglected) higher order radiative 
corrections. 

If the standard method for the propagation of errors were to be applied, one would calculate the 
theoretical errors according to Eq. |2[ by differentiation of Eqs. |^ and |8|, and add them in quadrature to 
the experimental errors on the Bi to obtain ctj. This approach is problematic for a number of reasons: 

• the input errors are not necessarily Gaussian (for example the theoretical uncertainty on the 
neglected higher order terms); 

• the uncertainty on B- ^°^^ is non-Gaussian, as may be seen immediately from just the m^ and 



m^ dependences of B^ ^""^^ and B^ 
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respectively; 



• many rather lengthy derivative calculations are required; 

• there is no a priori guarantee that the neglect of higher order terms in the Taylor expansion is 
a reasonable approximation; 

• the four S^ °°'^^ predictions depend on many common input parameters such that the four ai 
cannot be treated as independent errors. 

The numerical procedure described in section y avoids all of these problems. A large ensemble 
of values of B- '^™^ is created by choosing values for Gf, t^, etc. according to their errors and 
then evaluating Bl^^'^'y, B*^^°^y, ^^heory^ ^^^ B^^""'^ according to Eqs. | and |. The likelihood is 
calculated according to Eq. |^ with Uj taken to be the error on B^^^ ' only. The full likelihood, allowing 
for the errors on B^^^ ' , B- '^"^^ , and all correlations is then obtained from the normalised sum of the 
likelihoods for the full ensemble. To be specific, the following steps are carried out: 

1. define the likelihood functions, L(xj), for the independent variables, xf, 

2. create a 2D histogram of m,^^ vs. sin^ 9i,; 

3. repeat the following steps, (a) and (b), n times: 



(a) choose a set of values {x^} randomly according to the likelihood functions, L{xi). 

(b) for each bin in the histogram choose rriu^ and sin ^l at the centre of the bin and then 
evaluate L(m^^,sin2 ^il^f ?*, 5^^P*,5f p*, S^^*), according to Eqs. |, 0, and |, and add 
the value of L to the contents of the bin; 

4. normalise the histogram to unity to obtain L{mu^, sin^ 9 l). 

Figure 0(a) shows the contours of the two dimensional likelihood distribution, L{mu^ , sin^ 9 l) which 
correspond to the 90% and 95% confidence levels. No evidence is seen for a non-zero neutrino mass, 
nor for mixing. By integration of the two dimensional likelihood over all values of sin^ 0l we obtain 
the one-dimensional likelihood for m,^^, as shown by the solid line of figure |2|(b) , which yields upper 
limits of rriur < 42(48) MeV at the 90(95)% confidence levels. The solid line of figure |2|(c) shows the 
one-dimensional likelihood distribution for sin^^L; integrated over all values of m^^, from which we 
derive the upper limits: sin'^ ^l < 0.014(0.017) at the 90(95)% confidence levels. 

6 Conclusions 

The standard method for the propagation of errors, based on a Taylor series expansion, is approximate 
and frequently inadequate for realistic problems. In particular, it assumes that the errors on the 
independent quantities are Gaussian, that the error on the derived quantity is Gaussian, that the 
required derivatives are calculable, and that higher order terms in the Taylor expansion are negligible. 

A numerical method for the propagation of errors is described which makes no such assumptions, 
provides exact results with arbitrary precision, and is straightforward to implement even for compli- 
cated problems. 

Realistic examples illustrating this numerical technique are described. The interpretation of con- 
straints on neutrino masses, with either a Classical or a Bayesian approach [f^, has received much 



attention in the literature |13,1^. In the example described herein, a flat Bayesian prior distribution 
has been implicitly assumed by sampling m^^ and sin ^l from a histogram with uniform bins. Simi- 
larly, negative neutrino masses are excluded by the choice of the histogram range. While such choices 
are a matter of discussion, it should be emphasised that they are in no way imposed by the use of the 
numerical algorithm for the propagation of errors, which is in fact generally applicable. 
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Figure 2: Likelihood distributions for all r decay channels combined, for (a) sin ^l vs. my^, (b) niy^, 
integrated over sin^^Lj and (c) sin^^L, integrated over rriy^. 
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